fix: implement batch generation for OpenAI models #1846
Open
ianliuy wants to merge 1 commit into dottxt-ai:main from
Conversation
Replace NotImplementedError stubs in OpenAI.generate_batch() and AsyncOpenAI.generate_batch() with working implementations:

- Sync: loops over inputs calling self.generate() for each prompt
- Async: uses asyncio.gather() for concurrent execution

Update tests from asserting NotImplementedError to verifying correct batch results using mock clients.

Fixes dottxt-ai#1391

Signed-off-by: Yiyang Liu <37043548+ianliuy@users.noreply.github.com>
What's broken?
Calling the generator with a list of prompts (vectorized/batched calls) crashes for OpenAI models. For example:
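The original reproduction snippet was lost in this page capture. A minimal sketch of the failure mode, using a stub class in place of the real outlines OpenAI model (the stub and its prompts are assumptions, not taken from the PR):

```python
# Stub standing in for the pre-fix outlines OpenAI model, so the crash
# is visible without an API key or the library installed.
class StubOpenAIModel:
    def generate(self, prompt):
        # Single-prompt calls work fine
        return f"response to {prompt!r}"

    def generate_batch(self, prompts):
        # Pre-fix behaviour: the batch path was a stub
        raise NotImplementedError("Batch generation is not available")

model = StubOpenAIModel()
error = None
try:
    # Passing a list of prompts (the vectorized/batched call) crashes
    model.generate_batch(["What is 2+2?", "Name a colour."])
except NotImplementedError as exc:
    error = exc
print("crash:", error)
```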
Who is affected?
Any user trying to batch multiple prompts with OpenAI (or AsyncOpenAI) models. Single-prompt calls work fine. This was originally reported against v0.1.13, and while the codebase has been completely restructured for v1.0, the underlying user need (batched generation) is still unmet: `generate_batch()` raises `NotImplementedError`.

When does it trigger?
Every time `model.batch([...])` or `generator.batch([...])` is called with an OpenAI model. 100% reproducible.

Where is the bug?

- `outlines/models/openai.py` lines 295-303: `OpenAI.generate_batch()` raises `NotImplementedError`
- `outlines/models/openai.py` lines 445-453: `AsyncOpenAI.generate_batch()` raises `NotImplementedError`

Why does it happen?
When batch support was added to the model hierarchy (commit c350ddc9), models with native batch APIs (Transformers, vLLM) got real implementations, while API-based models (OpenAI, Anthropic, etc.) got `NotImplementedError` stubs, since their APIs don't support multiple prompts in one call.

However, batch can be trivially implemented for API models by looping over individual `generate()` calls: each prompt becomes a separate API request. The async variant can use `asyncio.gather()` for concurrent execution.

How did we fix it?
Replaced the `NotImplementedError` stubs in `OpenAI.generate_batch()` and `AsyncOpenAI.generate_batch()` with implementations that:

- Sync: loop over `self.generate()` for each prompt
- Async: use `asyncio.gather()` to fire all prompts concurrently

This is a minimal, surgical change (~10 lines per method) that follows the existing `generate()` contract: input formatting, output type handling, and error handling are all delegated to the already-tested `generate()` method. The same pattern could be applied to other API models (Anthropic, Gemini, Mistral, etc.) as a follow-up.
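The two patterns can be sketched as follows. This is a simplified illustration, not the PR's actual code: the class names, signatures, and stubbed `generate()` bodies here are assumptions; the real methods live on `OpenAI`/`AsyncOpenAI` in `outlines/models/openai.py`:

```python
import asyncio

class OpenAISketch:
    def generate(self, prompt, **kwargs):
        # Stand-in for one real API call
        return f"completion for {prompt!r}"

    def generate_batch(self, prompts, **kwargs):
        # Sync: one generate() call per prompt, executed sequentially
        return [self.generate(p, **kwargs) for p in prompts]

class AsyncOpenAISketch:
    async def generate(self, prompt, **kwargs):
        return f"completion for {prompt!r}"

    async def generate_batch(self, prompts, **kwargs):
        # Async: fire all prompts concurrently and await them together
        return await asyncio.gather(
            *(self.generate(p, **kwargs) for p in prompts)
        )

sync_out = OpenAISketch().generate_batch(["a", "b"])
async_out = asyncio.run(AsyncOpenAISketch().generate_batch(["a", "b"]))
print(sync_out == async_out)  # True: same results, different execution model
```

Because each prompt delegates to `generate()`, all input formatting and output-type handling come along for free; the async variant just overlaps the network round-trips instead of serializing them.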
How do we know it works?
Updated `test_openai_batch` and `test_openai_async_batch` from asserting `NotImplementedError` to verifying correct batch results using mock clients.

cc @RobinPicard @cpfiffer
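The mock-client verification could look roughly like this. A hedged sketch only: the mock wiring and expected strings below are assumptions, not copied from the PR's tests:

```python
from unittest.mock import MagicMock

# Mock model whose generate() returns a deterministic per-prompt string
model = MagicMock()
model.generate.side_effect = lambda prompt: f"mocked: {prompt}"

# Re-create the sync batch pattern against the mock
prompts = ["p1", "p2", "p3"]
results = [model.generate(p) for p in prompts]

# The batch must return one result per prompt, in order,
# and must have called generate() exactly once per prompt
assert results == ["mocked: p1", "mocked: p2", "mocked: p3"]
assert model.generate.call_count == len(prompts)
print("batch results verified")
```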